NSF PAR Search | NSF Public Access Repository


Contextual AI models for context-specific prediction in biology

https://doi.org/10.1038/s41592-024-02342-2

Li, Michelle; Zitnik, Marinka (August 2024, Nature Methods)

Full Text Available
Contextual AI models for single-cell protein biology

https://doi.org/10.1038/s41592-024-02341-3

Li, Michelle M; Huang, Yepeng; Sumathipala, Marissa; Liang, Man Qing; Valdeolivas, Alberto; Ananthakrishnan, Ashwin N; Liao, Katherine; Marbach, Daniel; Zitnik, Marinka (August 2024, Nature Methods)

Abstract. Understanding protein function and developing molecular therapies require deciphering the cell types in which proteins act as well as the interactions between proteins. However, modeling protein interactions across biological contexts remains challenging for existing algorithms. Here we introduce PINNACLE, a geometric deep learning approach that generates context-aware protein representations. Leveraging a multiorgan single-cell atlas, PINNACLE learns on contextualized protein interaction networks to produce 394,760 protein representations from 156 cell type contexts across 24 tissues. PINNACLE’s embedding space reflects cellular and tissue organization, enabling zero-shot retrieval of the tissue hierarchy. Pretrained protein representations can be adapted for downstream tasks: enhancing 3D structure-based representations for resolving immuno-oncological protein interactions, and investigating drugs’ effects across cell types. PINNACLE outperforms state-of-the-art models in nominating therapeutic targets for rheumatoid arthritis and inflammatory bowel diseases and pinpoints cell type contexts with higher predictive capability than context-free models. PINNACLE’s ability to adjust its outputs on the basis of the context in which it operates paves the way for large-scale context-specific predictions in biology.
Full Text Available
Subgraph Neural Networks

Alsentzer, Emily; Finlayson, Samuel; Li, Michelle; Zitnik, Marinka (January 2020, Advances in neural information processing systems)
Deep learning methods for graphs achieve remarkable performance on many node-level and graph-level prediction tasks. However, despite the proliferation of the methods and their success, prevailing Graph Neural Networks (GNNs) neglect subgraphs, rendering subgraph prediction tasks challenging to tackle in many impactful applications. Further, subgraph prediction tasks present several unique challenges: subgraphs can have non-trivial internal topology, but also carry a notion of position and external connectivity information relative to the underlying graph in which they exist. Here, we introduce SubGNN, a subgraph neural network to learn disentangled subgraph representations. We propose a novel subgraph routing mechanism that propagates neural messages between the subgraph's components and randomly sampled anchor patches from the underlying graph, yielding highly accurate subgraph representations. SubGNN specifies three channels, each designed to capture a distinct aspect of subgraph topology, and we provide empirical evidence that the channels encode their intended properties. We design a series of new synthetic and real-world subgraph datasets. Empirical results for subgraph classification on eight datasets show that SubGNN achieves considerable performance gains, outperforming strong baseline methods, including node-level and graph-level GNNs, by 19.8% over the strongest baseline. SubGNN performs exceptionally well on challenging biomedical datasets where subgraphs have complex topology and even comprise multiple disconnected components.
Full Text Available
The GGCMI Phase 2 experiment: global gridded crop model simulations under uniform changes in CO2, temperature, water, and nitrogen levels (protocol version 1.0)

https://doi.org/10.5194/gmd-13-2315-2020

Franke, James A.; Müller, Christoph; Elliott, Joshua; Ruane, Alex C.; Jägermeyr, Jonas; Balkovic, Juraj; Ciais, Philippe; Dury, Marie; Falloon, Pete D.; Folberth, Christian; et al (January 2020, Geoscientific Model Development)

Abstract. Concerns about food security under climate change motivate efforts to better understand future changes in crop yields. Process-based crop models, which represent plant physiological and soil processes, are necessary tools for this purpose since they allow representing future climate and management conditions not sampled in the historical record and new locations to which cultivation may shift. However, process-based crop models differ in many critical details, and their responses to different interacting factors remain only poorly understood. The Global Gridded Crop Model Intercomparison (GGCMI) Phase 2 experiment, an activity of the Agricultural Model Intercomparison and Improvement Project (AgMIP), is designed to provide a systematic parameter sweep focused on climate change factors and their interaction with overall soil fertility, to allow both evaluating model behavior and emulating model responses in impact assessment tools. In this paper we describe the GGCMI Phase 2 experimental protocol and its simulation data archive. A total of 12 crop models simulate five crops with systematic uniform perturbations of historical climate, varying CO2, temperature, water supply, and applied nitrogen (“CTWN”) for rainfed and irrigated agriculture, and a second set of simulations represents a type of adaptation by allowing the adjustment of growing season length. We present some crop yield results to illustrate general characteristics of the simulations and potential uses of the GGCMI Phase 2 archive. For example, in cases without adaptation, modeled yields show robust decreases to warmer temperatures in almost all regions, with a nonlinear dependence that means yields in warmer baseline locations have greater temperature sensitivity. Inter-model uncertainty is qualitatively similar across all the four input dimensions but is largest in high-latitude regions where crops may be grown in the future.
Full Text Available
